A distributed chunk calculation approach for self-scheduling of parallel applications on distributed-memory systems

Abstract

Loop scheduling techniques aim to achieve load-balanced executions of scientific applications. Dynamic loop self-scheduling (DLS) libraries for distributed-memory systems are typically MPI-based and employ a centralized chunk calculation approach (CCA) to assign variably-sized chunks of loop iterations. We present a distributed chunk calculation approach (DCA) that supports various types of DLS techniques. Using both CCA and DCA, twelve DLS techniques are implemented and evaluated in different CPU slowdown scenarios. The results show that the DLS techniques implemented using DCA outperform their corresponding ones implemented with CCA, especially in extreme system slowdown scenarios.
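The difference between the two approaches named in the abstract can be illustrated with a small single-process sketch. It is not the authors' implementation: guided self-scheduling (chunk = ceil(remaining / workers)) is picked here only as one familiar DLS rule, `SharedState`, `worker_claim`, and the other names are hypothetical, and a `threading.Lock` stands in for whatever atomic or one-sided MPI operation a real distributed-memory library would use.

```python
import threading


def gss_chunk(remaining, workers):
    """Guided self-scheduling rule: ceil(remaining / workers), at least 1."""
    return max(1, (remaining + workers - 1) // workers)


def centralized_schedule(total, workers):
    """CCA-style sketch: a single coordinator computes every chunk size."""
    chunks, start = [], 0
    while start < total:
        size = min(gss_chunk(total - start, workers), total - start)
        chunks.append((start, size))
        start += size
    return chunks


class SharedState:
    """Hypothetical shared record: only the next unscheduled iteration index."""

    def __init__(self):
        self.next_start = 0
        self.lock = threading.Lock()  # assumption: stand-in for an atomic update


def worker_claim(state, total, workers):
    """DCA-style sketch: the requesting worker computes its own chunk size
    and only commits the new start index to the shared record."""
    with state.lock:
        start = state.next_start
        if start >= total:
            return None  # no iterations left
        size = min(gss_chunk(total - start, workers), total - start)
        state.next_start = start + size
        return (start, size)


def distributed_schedule(total, workers):
    """Drive worker_claim until the loop is exhausted (sequential simulation)."""
    state = SharedState()
    chunks = []
    while True:
        claim = worker_claim(state, total, workers)
        if claim is None:
            return chunks
        chunks.append(claim)
```

In this sketch both functions produce the same sequence of chunks; what changes is who performs the calculation. Under CCA the coordinator does all of it, while under DCA the coordinator's role shrinks to holding a small shared record, which is one plausible reason a slowed-down coordinator would hurt CCA more.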

Similar resources

Runtime Incremental Parallel Scheduling (RIPS) on Distributed Memory Computers - Parallel and Distributed Systems, IEEE Transactions on

Runtime Incremental Parallel Scheduling (RIPS) is an alternative strategy to the commonly used dynamic scheduling. In this scheduling strategy, the system scheduling activity alternates with the underlying computation work. RIPS utilizes the advanced parallel scheduling technique to produce a low-overhead, high-quality load balancing, as well as adapting to irregular applications. This paper pr...

Chunk: A Framework for Modular Distributed Shared Memory Systems

We present Chunk, a framework for building modular distributed shared memory systems for UNIX. Chunk allows applications that are designed to share local memory through the UNIX memory mapped file mechanism (mmap) to be able to share memory across different physical hosts without modifications. Chunk’s modular architecture enables the use of a variety of memory-sharing policies. We present a DS...

Parallel Loop Scheduling Approaches for Distributed and Shared Memory Systems

In this paper, we propose different approaches for the parallel loop scheduling problem on distributed as well as shared memory systems. Specifically, we propose adaptive loop scheduling models in order to achieve load balancing, low runtime scheduling, low synchronization overhead and low communication overhead. Our models are based on an adaptive determination of the chunk size and an exploit...

Adaptively Scheduling Parallel Loops in Distributed Shared-Memory Systems

Using runtime information of load distributions and processor affinity, we propose an adaptive scheduling algorithm and its variations from different control mechanisms. The proposed algorithm applies different degrees of aggressiveness to adjust loop scheduling granularities, aiming at improving the execution performance of parallel loops by making scheduling decisions that match the real work...

Scalability of Finite Element Applications on Distributed-memory Parallel Computers

This paper demonstrates that scalability and competitive efficiency can be achieved for unstructured grid finite element applications on distributed memory machines, such as the Connection Machine CM-5 system. The efficiency of finite element solvers is analyzed through two applications: an implicit computational aerodynamics application and an explicit solid mechanics application. Scalability of mesh ...

Journal

Journal title: Journal of Computational Science

Year: 2021

ISSN: 1877-7511, 1877-7503

DOI: https://doi.org/10.1016/j.jocs.2020.101284